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Abstract 


Pixel lensing is a technique used to search for baryonic components 
of dark matter (MACHOs) and allows to detect microlensing events even 
when the target galaxies are not resolved into individual stars. Potentially, 
it has the advantage to provide higher statistics than other methods but, 
unfortunately, traditional approaches to pixel lensing are very demand¬ 
ing in terms of computing time. We present the new, user friendly, tool 
MEDEA (Microlensing Experiment Data-Analysis Software for Events 
with Amplihcation). The package can be used either in a fully automatic 
or in a semi-automatic mode and can perform an on-line identihcation of 
events by means of a two levels trigger and a quasi-on-line data analysis. 
The package will find application in the exploration of large databases as 
well as in the exploitation of specifically tailored future surveys. 
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1 Introduction 


During the last decade, much attention has been devoted to the detection of 
MACHOs (Massive Astrophysical Compact Halo Objects). Several teams have 
shown that microlensing can be successfully used to detect the slight changes in 
the luminosity of background stars caused by the passage of a massive deflector 
interposed along (or close) the line of sight (Paczynski 1986, Alcock et al. 1993, 
Auborg et al. 1993, Udalski et al. 1993). 

The pixel lensing method was proposed and implemented by the AGAPE 
Collaboration to monitor simultaneously large numbers of stars and therefore 
increase the probability to detect the intrinsically rare microlensing events (Bail- 
Ion et al. 1993, Ansari et al. 1997). A similar technique, based on image sub¬ 
traction, was used by the VATT-Columbia Collaboration (Tomaney and Crotts 
1994, Tomaney 1996). Both techniques were successfully tested; recently the dis¬ 
covery of new candidate events towards the Andromeda galaxy was announced 
(Auriere et al. 2001, Calchi Novati et al. 2002). Additional microlensing candi¬ 
dates towards Andromeda were found by Crotts (2000). 

In pixel lensing, the search for microlensing events is performed by moni¬ 
toring the light curve of individual image pixels rather than that of individual 
stars. Such approach, however, while ensuring an enormous gain in the statis¬ 
tics, poses paramount problems in resolving the lensed sources. From a com¬ 
putational point of view, pixel lensing requires a large number of floating point 
operations on each individual pixels. This leads to very large computational 
loads and usually prevents the real-time detection (and possible follow-up’s) of 
the candidate events. In this paper we describe the new tool MEDEA - already 
outlined in a preliminary and much less complete version in (Capozziello and 
lovane 1999) - which, by making use of advanced data mining techniques, allows 
the real time processing of pixel data. 

This paper is organized as follows. In the Sect.2, we give the scientific 
background of the microlensing theory and of the pixel lensing technique. Sect.3 
is devoted to a general description of the MEDEA environment. Sect.4 describes 
the Data Pre-Processing and Sect.5 the Data Processing. The analysis of the 
data is discussed in Sect.6, while some conclusions are drawn in Sect.7. 


2 Microlensing and Pixel Lensing Technique 


If a massive object acting as a gravitational lens on a background source is 
sufficiently close to the line of sight of the observer, the light is deflected by 
an angle which is usually too small to produce observable multiple images and 
we can observe only a magnification of the flux coming from the source. The 
magnification factor A{t) is given by: 


A(t) = + ^ 

u{t)^Ju^{t) -1-4’ 


( 1 ) 
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where u{t) = OsDql/Re, dg is the angle between the optical axis and the 
direction of the source, Dql is the distance between the observer and the lens, 


Re = Dql * Oe = 


AGM DlsDql 


is the Einstein radius, Dls is the distance 


Dos 

between the lens and the source; Dos is the distance between the observer and 
the source. These events, also known as microlensing, are characterized by three 
main features: 


- uniqueness of the event: the probability that a microlensing events 
occurs twice on the same star is assumed to be zero; 

- symmetry of the light curve: the light curve (in point-like model of 
lens and source) has to be symmetric about the maximum magnification 
point; 


achromaticity of the event: the ratio between the flux variation in 


different colours, 
Falco 1992). 




our^l 


^4^col 


ourJ2 


is constant in time (Schneider, Ehlers and 


It is useful to stress that a differential amplihcation of extended sources can give 
rise to a chromatic, but still symmetric, lensing curve (Han et al. 2000). To 
study the case of pixel lensing, let us start from defining the flux inside a fixed 
pixel as: 


(j)P = (j)* + 4>bkg + n, (2) 

where (f>* is the photon flux of the star ” at rest” that will be lensed, is the 
photon flux coming from neighboring stars falling in the same pixel, and n is 
the noise. A microlensing event implies a variation of (fiP with time: 

4^ = 4>bkg + {A(t) -1) -(j)* +n. (3) 


3 The Environment 

MEDEA is structured as follows (see Fig.l): (lovane 2002)|^ 

i) The Advanced Data Acquisition (A-DAQ) Unit: responsible for the data 
acquisition and pre-reduction. 

ii) The Control Unit (CU) piloting the Telescope Control System (TCS) which 
drives the telescope following the instructions provided either by the Data Base 
Control System (DECS) or by the user. 

iii) The DataBase (DB) Unit: it is the ’’intelligent” part of the system where 
the data are stored and processed according to the simulations or previous 
observations (Jordan 1997). 

^Winner of the Price "Best Application of Measure and Automation in Europe 2001-2002” 
(Assigned by National Instruments, Italy), 28 February 2002. 
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iv) The Processing and Analyzing (P&A) Unit: platform where massive data 
analysis is performed. 

v) Dispatcher Unit (DU): which automatically builds a status report on the 
different phases of the data flow. More in detail: statistics, plots of data and 
events are produced and stored by this module. Moreover, in the occurrence of 
special events (such as an alert or failure of the system, or a short microlensing 
event, for which a quick answer is needed) this unit can reach and alert people 
in automatic mode thanks to an e-mail service, SMS (Short Message System) 
and Fax Messages. 

The Processing and Analyzing Unit consists of three main units: the Data 
Pre-Processing Unit (DAPP) for astrometric alignment and for photometric 
and seeing correction; the Data Processing Unit (DAP) for peak detection of 
relevant luminosity variation; the Data Analysis Unit (DAU) for best fits, color 
correlations, tests, Kolmogorov - Smirnov tests. The A-DAQ Unit and DAPP 
Unit are organized in a fully automatic, non interactive and on line modality. A 
First Trigger Level (for selecting luminosity variations trough a peak detection 
algorithm) is the most relevant component of DAP Unit. It consists of four 
sections: a) peak detection procedure, b) star detection and filtering algorithm, 
c) cosmic rays filter, and d) peak classification (single, double, multiple peak 
curves) routines. Also the DAP Unit operates in real time, thus implying that 
the whole data reduction is performed while the next set of data is acquired. 
The second trigger level, corresponding to the selection of microlensing events, is 
inside the DAU. During this phase, we also test whether the measured luminosity 
curves are compatible or not with simple lensing models^ 

The events which pass the first trigger level and are incompatible with the 
simple models included in the second trigger level are studied off-line by means 
of interactive procedures. This study is relevant in order to understand those 
events which are produced by complex lensing phenomena (such as double de¬ 
flector, planetary system and so on), variable stars or novae and supernovae. 

4 Data Pre-Processing 

The Data Pre-Processing Unit (DAPP) is composed by three software modules: 
Astrometric Alignment Unit, Photometric Correction Unit, Seeing Correction 
Unit. 

4.1 Astrometric Alignment 

In the astrometric alignment module, we find three hierarchical levels: the first 
one is responsible of the data I/O and controls the user interface, the second one 
controls the learning of a properly selected part of the reference image which 

^Such models are single point-like source and single point-like deflector model, double point¬ 
like source and single point-like deflector model, extended source with constant brightness and 
single point-like deflector model. 
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will be compared with the other images to be aligned, while the last object 
performs the astrometric alignment according to the well known transformation 



At the second level, it is possible to define a dimension for the images on which 
the system has to learn the pattern. In other words, the program fixes the size 
of the calibration windows as a function of both the density of the field and of 
its auto-similarity. Of course, it is possible to use the full frame, but this could 
turn out to be computationally too heavy. The structure of the learning phase 
is the following: 

• an automatic threshold filters the pixels having counts above a fixed tresh- 
old 4>th (with (j)th = (j) + 3(t); 

• a clustering procedure builds pixel clusters and rejects those clusters which 
are either too large or too small (compared to a test performed on the 
pixel area), or have a very irregular shape (evaluated against the inertial 
momenta of the pixel cluster); 

• the evaluation of the center of mass for each surviving cluster and of the 
corresponding pattern considered for the astrometric calibration. 

In order to perform pattern matching, we implemented a tool (Image Advanced 
Interpreter Matching) which incorporates image understanding techniques to 
interpret the template information, and then uses this information to recognize 
the template in the image. In order to generate information about the features 
of a template image we have used the following information source: a) geometric 
modeling of images; b) effective non-uniform sampling of images; c) extraction 
of template information. 

This algorithm is built to fulfill the following main functions: 

1. Edge detection and clusters selection; 

2. Evaluation of geometrical parameters Xi'. cluster’s area (in pixel), 
cluster’s perimeter, number of holes in each cluster, hole’s area (in pixel), hole’s 
perimeter, cluster’s inertial tensor (where the luminosity plays the role of the 
mass); 

3. Computation of a linear function of the previous parameters for each 
cluster, accordingly to the formula: 


/Cluster ^ ^ 


where ai = — =. 

y {xi - xif 

The calibration is connected with the maximum correlation between images. 


trough /Cluster ^ Other words, we minimize: 


c = Ysn 
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( 5 ) 
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where j is the image index, while i and I are the cluster indexes. We evaluate 
the coefficients aij and Ax, Ay of eq.(4) by using a number of cluster larger than 
the number of calibration parameters, in correspondence of the minimum value 
of C. This approach reduces the amount of information needed to characterize 
completely an image or pattern, thus speeding up the searching process. 

4.2 Photometric Correction 

Changes in the observing conditions are locally corrected with respect to the 
reference image which, by definition, we assume to be the one with the best see¬ 
ing. The algorithm, which was implemented taking into account the possibility 
of parallel computing, corrects local fluctuations of the flux due to gradients 
in the image (e.g. the effect the lunar stray light). The image is first divided 
in cells then, in addition to the pixel coordinates Xi and y^, we have two cell 
indexes, Xframe and yframe- Moreover, we have an index to select the image, 
I. In this way, in order to select the luminosity of a pixel in an image of the set 
we have to assign a vector with five components: (/, Xframe, y frame, xt, yi) . 

The mean flux value and the standard deviation are evaluated for each cell. 
Noisy pixels, cosmic rays and bad pixels are rejected and then the mean flux and 
the standard deviation are again evaluated and stored. After these operations, 
we impose that the mean value of the flux must be equal on the compared cells. 

This method does not work properly for pixels lying on the edges of the 
cellsQ. If at the end of the analysis there is a pixel on the edge of a cell with 
an interesting light curve, a specific photometric alignment is performed off-line 
by posing the relevant pixel at the center of a new cell and then iterating the 
above described procedure. 

In the first prototype of the pipeline, we implemented a bi-dimensional inter¬ 
polation on edges, but the signal turned out to be diluted by the interpolation 
process thus imposing the introduction of the cell approach. Fig. 2, 3, and 4 
show the reference image, the current image, and the current image after the 
photometric correction. 

Moreover, we find the grey level histograms before and after photometric 
calibration in Fig.5 and 6. 

4.3 Seeing Correction 

The real time (in a post processing point of view) correction of the seeing is a 
relevant issue (cf. Sedmak 1999). The analysis is performed on a cluster of pixels 
(superpixel) with a size large enough to cover the PSF. In the standard pixel 
leasing data reduction (Ansari 1997, Le Du 2000), the superpixel size is chosen 
to be large enough to cover the worse PSF and is the same for all images. Here 
we dynamically select the dimension of the superpixel with respect to the PSF 

®It is relevant to choose an optimized cell size in order to obtain a good compromise between 
the field of view and the number of pixels near the edges of the cells. For example, given an 
image taken on Andromeda at a distance of 690 Kpc with a CCD camera with 1024 X 1024 
pixels, a good cell size is one order smaller than the original image size (i.e. about 100 X 100). 
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measured on some calibrators in the fields (i.e. for each image corresponding 
to a given observation). The seeing variation factor is evaluated and defined as 
the ratio of the area in pixels of selected clusters between a reference image and 
another one. 

This number becomes the input to build the kernel kij of a morphing algo¬ 
rithm that in our case is a dilation algorithm. J^hen if K is the kernel and P is 
the pixels matrix, the seeing corrected image P is: 


EE {KS ® P^) 

a b 

a b 


with 


a S [i — l,i + 1] 
b € [j — m,j + m] ’ 


( 6 ) 


where I x m is the kernel size for the convolution. In this procedure, the 
kernel is a polynomial function in 3?^ and the superpixel can be chosen between 
rectangular and exagonal one (see Fig.7). In Figs. 8 and 9 we show the same 
field in different seeing conditions. 


5 Data Processing 

The Data Processing Unit is made of four sub-units: Star Detection and filter¬ 
ing Unit, Cosmic Detection and filtering Unit, Peak Detection Unit, and Peak 
Classification Unit. The cosmic rays unit just switches off the saturated pixels, 
while the peak classification unit splits the curves in function of the number of 
the peaks attributing them to different classes. 


5.1 Star Detection 

This component finds and rejects resolved objects inside the field. After per¬ 
forming several tests, we decided to work in the transformed space. In astro¬ 
nomical image processing the Fourier Transform is often used to select the main 
characteristics or the morphology of bright objects (Pratt 1977, Sedmak et al. 
1981). If we consider the image as a function of two variables f(x, y) G C^, we 
will be able to define a transformed image f{u,v) G C^. In our case f{u,v) is 
obtained by fast Fourier transforming the image f{x, y) 


f{u,v) 


N-lM-1 


NM 




-2«(^ + a) 


a:=0 y—0 


(7) 


where (m, v) are the horizontal and vertical spatial frequencies. The function 
f{u,v) is a complex image in which the high frequencies are clustered at the 
center, while the low ones are located at the edges. Fig. 10 shows the complex 
image f(u,v) corresponding to the real image f{x,y) shown in Fig.2. 

To select and reject the resolved objects - which correspond to low frequency 
signals in the transformed image - we implemented an adaptive high pass filter 
in the transformed space which produces an automatic thresholding in the fre¬ 
quencies domain. In Figs. 11 and 12, we give the scheme of the filter and its 
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effect on the complex image, while in Fig. 13 we show the same image antitrans¬ 
formed and therefore cleaned of the resolved objects. 


5.2 Peak Detection 

The detection of the luminosity variation in the light curve is performed by a 
peak detection algorithm. First of all we evaluate the background level as: 

/ ^ j+a \ 

= min I —1 Vj = 1, ...,n - a + 1, (8) 

where n is the number of points in the light curve (e.g. the number of images), 
a + 1 is the dimension of the window which we use to evaluate (fibkg (a + 1 < n). 
We evaluate the mean value of the flux in a window running on the light curve 
and having a size of a + 1; then we consider as background the minimum of 
the means and then evaluate the maximum standard deviation cr to assess its 
stability. Before the peak detection step, we apply a median hlter on the signal 
in order to estimate the best peak parameter by separating pure signals from 
noise fluctuations. If Y represents the output sequence, id est the filtered data, 
and if Ji represents a subset of the input sequence X centered on the element 
of X 

Ji — — — ^i—1: — j 

if the indexed elements outside the range of X are equal to zero, the function 
gives the elements of Y by using: 


yi = Median {Ji) Vz = 0, 1, 2,..., n — 1, 

where n is the number of elements in the input sequence X, and r is the filter 
rank. The effects of the filter are shown in Fig.14. 

Afterwards we perform a peak detection on the hltered signal with a polynomial 
(usually of the 2“"^^ order is enough) linearly combined, trough a sum operator: 


\^aiX^ + (3^x + (9) 

iei 

where I is the space of peaks and the symbol (jj means a sum of functions at 
different x. For instance, for a 2-modal light curve, we have: 


-\- I3ix -\-ji 

iGl 


aix^ + Pix + 7 i Va; < x* 

a 2 X^ + 132X + 72 Va; > x* 


( 10 ) 


Outputs of this routine are the parameters to be used for the ht of the light 
curve: i.e. the number of peaks, the location (to) and the amplitude. 



6 Data Analysis 

The on-line and off-line are performed in the Data Analysis Unit. Here, the 
statistical and Kolmogorov-Smirnov tests, and the evaluation of a specific 
quality factor are made. The color correlation among light curves is tested in 
different color bands at the end of the process. 


6.1 On-line and Off-line Analysis 

The on-line sub-unit^ can perform basic fits using the parameters derived in 
the peak detection phase by means of the Levenberg-Marquardt algorithm for 
non-linear ht. This sub-unit is fully automated and perform in real time the 
following fits: 

• Paczynski test. We consider magnification A defined as 


^ u^(t)+2 ^ 1 

u{t)^Jv?{t) + 4 u{t) 



( 11 ) 


with uq = tq/Re , where rg is the impact parameter at maximum magni¬ 
fication, tg is the time of maximum magnification, and is the Einstein 
time. 


• Double source test. According to the model by Griest and Hu (1992) 
there will be two tg and two ug and therefore two functions: mi(U) and 
U 2 {ti). Consequently, let (pi and ^2 be the flux of the two parts and the 
flux offset ratio w = then the light magnihcation is given by: 


Abs{ui{t), U 2 {t)) = (1 - uj)A{ui{t)) + UjA{u 2 {t)). 

The light curve is the superposition of two light curves for point-like 
sources, affected by a point-like mass deflector. 

• Extended circular source. This is the case of an optical system 
with a point-like lens and an extended circular source having radius r and 
constant surface brightness. The implementation uses the model proposed 
by Witt and Mao (Witt and Mao 1994). 

MEDEA also contains a set of tools to study more complex events (cf. Di 
Stefano and Mao 1996, Dominik 1998, Dominik 1999). In particular, the events 
which pass the first trigger level but are not selected by the second trigger can be 
analyzed off-line both automatically and manually. The analysis is performed 
by means of the comparison with reference simulated data stored inside the 

^This unit was tested on data collected on the 1.3 meter McGraw-Hill Telescope, at the 
MDM observatory, Kitt Peak (USA) and actually it is in a fine tuning phase for data to be 
collected at the TTl (Toppo Telescope) CastelGrande (Italy). 
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database. The DB events are simulated in order to test the following hypotheses: 
a) double point-like lenses, and single point-like source; b) nova and supernova|^; 
variable source. 

Tools for a traditional analysis (i.e. standard light curves analysis) are im¬ 
plemented too^. 

6.2 Selection of lensing events 

The second trigger consists of two selective procedures: the first one is a sta¬ 
tistical phase, while the second one is specific for microlensing. The and 
Kolmogorov-Smirnov test are implemented to evaluate the quality of the tested 
hypotheses. For each model we consider the x^ test and the Q-factor of Kolmogorov- 
Smirnov test, then a quality factor M is estimated. It is defined as 

M = Qxx^ (12) 

and light curves with M < 1.5 are considered as possible microlensing event^ 
The events, passing the previous tests, are considered for the color test. The 
cross correlation between different colors is estimated as: 

N 

- m)color.l X ” {r))colorJ2 

, (13) 

J E X " m)llor.2 

y n=l 

where the index p is linked with pixel and n with the point along the light curve. 

For C > 0.98 the event is considered a good microlensing candidate. 

7 Summary and conclusions 

MEDEA was specifically tailored to perform automatic microlensing search. 
Most of the tools presented in this paper can be applied in other fields, where 
pipelines and data mining procedures are needed for large amounts of data. 
The procedures implemented in MEDEA environment and presented in the 
Data Acquisition (DAQ), and in the Data Pre-Processing (DAPP) units have 
been tested on simulated data and images, while the Data Processing (DAP) 
and the Data Analysis (DAU) units have been tested on a set of data collected 
at the 1.3 m McGraw-Hill telescope, at MDM observatory - Kitt Peak, in the 

^For novae and supernovae events, a MEDEA subunit was tested on a light curve analyzed 
by SLOTT-AGAPE collaboration (Calchi Novati et al. 2002). The light curve of these events 
are characterized by a very strong and chromatic flux variation, that has been considered as 
due to novae (Modjaz and Li 1999, Riffeser et al. 2001). 

®More details on the off-line analysis and other MEDEA modules will be given in a com¬ 
panion and successive paper. 

^Following the analysis and the results in (Calchi Novati et al. 2002), in order to obtain a 
better confidence on the sample, we also implemented a code to perform the Durbin Watson 
statistical test (Durbin and Watson 1951). 


E 
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period from the end of September to the end of December 1999, by using the 
bulge of Andromeda as target. The results agree with “traditional” pixel leasing 
analysis performed by the AGAPE Collaboration. The flexibility of the system 
allows to perform automatic, semiautomatic or researcher assisted procedures of 
the candidate microlensing events. Common microlensing events due to single 
and double point-like source, or extended source are analyzed on-line and with 
automatic procedures, while more complex events such as double lens, novae, 
supernovae, variable stars or multiple deflectors system (planetary systems), 
double point-like sources and lenses are studied off-line, either with automatic 
procedures (based on the use of a database of simulated events), or individually 
with other tools. In this way, all the advantages of non-automatic procedure are 
kept, and also common events are automatically studied so that the researcher 
has only to control them. The detection of short events or events with huge 
main peaks and secondary ones near the first (as in the case of planetary sys¬ 
tem) becomes possible thanks to the real time light curve monitoring and to 
the dispatcher implementation. In addition, thanks to the fast- and real-time 
analysis, the exact knowledge of the time region in which there is a luminosity 
magnification gives the possibility to use larger telescopes to resolve the pixels 
and obtain more detailed astrophysical information on the magnified star. Fi¬ 
nally the reduction of time to spend for data analysis after data taking is a good 
starting point to perform lensing analysis on large field CCD data obtained in 
the course of specific surveys. 

Acknowledgements: The authors wish to thank SLOTT-AGAPE Collabora¬ 
tion for useful suggestions and comments about lensing, and G.Sedmak for sug¬ 
gestions about the seeing correction. This work was partly sponsored by the 
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Legends to Figures 

Figure 1. The MEDEA flow chart. The dotted part (e.g. Telescope Remote 
Control, Telescope Remote Observing) in the control unit could be very useful 
if one starts to think about remote control and observations. 

Figure 2. ANDROMEDA in a good photometric condition. 

Eigure 3. ANDROMEDA in a not so good photometric condition. 

Figure 4. ANDROMEDA prototype image after photometric alignment. 

Figure 5. Gray level histogram of two images before photometric alignment. 

Figure 6. Gray level histogram of two images after photometric alignment. 

Figure 7. Rectangular and exagonal superpixel frame. 

Figure 8. Image with a good seeing condition (VLT Image of Distant Galax¬ 
ies in AXAF Deep Field by ESO 2000). 

Eigure 9. Image with a worst seeing condition (VLT Image of Distant Galax¬ 
ies in AXAF Deep Field by ESO 2000). 

Figure 10. FFT transformed image of ANDROMEDA. 

Figure 11. The scheme of the filter in the transformed space. 

Figure 12. The Altering effect on ANDROMEDA image in transformed 
space. 

Figure 13. The Altering effect on ANDROMEDA image in anti-transformed 
space (or real). 

Figure 14. Effect of median Alter. 
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